Project done by Michela Pirozzi MAT:732531 and Sara Ferioli MAT:733105
Given the features in the dataset, which combination of features is most favourable for obtaining a new contract? Was that combination the same before COVID?
Predict the number of contracts. Are the numbers forecast from data up to 2019 the same as, or similar to, the actual figures, given the occurrence of COVID?
# Load dataset
attivati = pd.read_csv("Rapporti_di_lavoro_attivati.csv")
attivati.head()
| | DATA | GENERE | ETA | SETTOREECONOMICODETTAGLIO | TITOLOSTUDIO | CONTRATTO | MODALITALAVORO | PROVINCIAIMPRESA | ITALIANO |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 09/05/2020 | F | 60 | Attività di famiglie e convivenze come datori ... | NESSUN TITOLO DI STUDIO | LAVORO DOMESTICO | TEMPO PIENO | BERGAMO | UCRAINA |
| 1 | 12/07/2019 | M | 43 | Gestioni di funicolari, ski-lift e seggiovie s... | LICENZA MEDIA | LAVORO A TEMPO DETERMINATO | TEMPO PIENO | BERGAMO | ITALIA |
| 2 | 05/06/2013 | F | 20 | Fabbricazione di altre apparecchiature elettri... | LICENZA MEDIA | APPRENDISTATO PROFESSIONALIZZANTE O CONTRATTO ... | TEMPO PIENO | BERGAMO | ITALIA |
| 3 | 12/03/2010 | F | 28 | Alberghi | DIPLOMA DI ISTRUZIONE SECONDARIA SUPERIORE CH... | LAVORO INTERMITTENTE A TEMPO DETERMINATO | NON DEFINITO | BERGAMO | ITALIA |
| 4 | 06/04/2021 | F | 49 | Rifugi di montagna | LICENZA MEDIA | LAVORO INTERMITTENTE | NON DEFINITO | BERGAMO | ITALIA |
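The later tables contain `mese-anno` and `ANNO` columns (for example `09/05/2020` becomes `2020-05` and `2020`). The cell that derives them is not shown, so the following is only a minimal sketch of one way to do it, assuming `DATA` is always in day/month/year format; the `sample` frame is a hypothetical stand-in for the real table.

```python
import pandas as pd

# Hypothetical stand-in: three DATA values copied from the preview above.
sample = pd.DataFrame({"DATA": ["09/05/2020", "12/07/2019", "05/06/2013"]})

# The dates are day/month/year, so the format is stated explicitly
# instead of letting pandas guess (which could swap day and month).
parsed = pd.to_datetime(sample["DATA"], format="%d/%m/%Y")

sample["mese-anno"] = parsed.dt.strftime("%Y-%m")  # year-month label
sample["ANNO"] = parsed.dt.year                    # calendar year

print(sample)
```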
# Show the graph
time_for_column(dati,"CONTRATTO","DATA")
# Show the graph
px.bar(df_merge_col, 'SETTOREECONOMICODETTAGLIO', 'count',
       color='SETTOREECONOMICODETTAGLIO', animation_frame='DATA',
       category_orders={'DATA': ['2018', '2019', '2020', '2021']}, title='', range_y=[0, 60000])
We can see that domestic work ('Attività di famiglie e convivenze come datori di lavoro per personale domestico') increased during COVID, whereas the hotel ('Alberghi') and restaurant ('Ristorazione con somministrazione') sectors decreased.
# Show the graph
map_fig(dati,"CONTRATTO")
fig.show()
# Show the graph
line_fig(prorogati_merge_col,cessati_merge_col,"DATA","prorogati","cessati")
print(f'The sector with the max number of activated contracts in 2020 is "{idmax_a20}" with {max_a20} contracts')
print(f'The sector with the min number of activated contracts in 2020 is "{idmin_a20}" with {min_a20} contracts')
print(f'The sector with the max number of activated contracts in 2021 is "{idmax_a21}" with {max_a21} contracts')
print(f'The sector with the min number of activated contracts in 2021 is "{idmin_a21}" with {min_a21} contracts')
The sector with the max number of activated contracts in 2020 is "Attività di famiglie e convivenze come datori di lavoro per personale domestico" with 49228 contracts
The sector with the min number of activated contracts in 2020 is "Trasporto mediante condotte di liquidi" with 1 contracts
The sector with the max number of activated contracts in 2021 is "Attività di produzione cinematografica, di video e di programmi televisivi" with 33020 contracts
The sector with the min number of activated contracts in 2021 is "Trasporto mediante condotte di liquidi" with 1 contracts
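The cell that computes `idmax_a20`, `max_a20` and the related variables is not shown. As a rough sketch of how such values could be obtained, one can count rows per year and sector and then take `idxmax`/`idxmin`; the `toy` frame and its contents are hypothetical.

```python
import pandas as pd

# Hypothetical toy data; the notebook works on the full activations table.
toy = pd.DataFrame({
    "ANNO": [2020, 2020, 2020, 2021, 2021, 2021],
    "SETTOREECONOMICODETTAGLIO": [
        "Alberghi", "Alberghi", "Rifugi di montagna",
        "Alberghi", "Rifugi di montagna", "Rifugi di montagna",
    ],
})

# Number of activated contracts for every (year, sector) pair.
counts = toy.groupby(["ANNO", "SETTOREECONOMICODETTAGLIO"]).size()

summary = {}
for year in (2020, 2021):
    per_sector = counts.loc[year]  # Series indexed by sector name
    summary[year] = {
        "max_sector": per_sector.idxmax(), "max": int(per_sector.max()),
        "min_sector": per_sector.idxmin(), "min": int(per_sector.min()),
    }

print(summary)
```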
join.head()
| | GENERE | ETA | TITOLOSTUDIO | CONTRATTO | MODALITALAVORO | PROVINCIAIMPRESA | ITALIANO | mese-anno | ANNO | Codice_ateco | SETTOREECONOMICODETTAGLIO_y |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | F | 60 | NESSUN TITOLO DI STUDIO | LAVORO DOMESTICO | TEMPO PIENO | BERGAMO | UCRAINA | 2020-05 | 2020 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... |
| 1 | F | 33 | NESSUN TITOLO DI STUDIO | LAVORO DOMESTICO A TEMPO DETERMINATO | TEMPO PARZIALE ORIZZONTALE | BERGAMO | HONDURAS | 2012-07 | 2012 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... |
| 2 | F | 45 | NESSUN TITOLO DI STUDIO | LAVORO DOMESTICO | TEMPO PIENO | BERGAMO | ITALIA | 2019-04 | 2019 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... |
| 3 | F | 61 | NESSUN TITOLO DI STUDIO | LAVORO DOMESTICO | TEMPO PIENO | LECCO | UCRAINA | 2014-09 | 2014 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... |
| 4 | F | 20 | NESSUN TITOLO DI STUDIO | LAVORO DOMESTICO | TEMPO PARZIALE ORIZZONTALE | LECCO | ITALIA | 2014-05 | 2014 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... |
indeterminato.head()
| | GENERE | ETA | TITOLOSTUDIO | CONTRATTO | MODALITALAVORO | PROVINCIAIMPRESA | mese-anno | ANNO | Codice_ateco | SETTOREECONOMICODETTAGLIO_y | ITALIANO |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | F | 60 | NESSUN TITOLO DI STUDIO | NON INDETERMINATO | TEMPO PIENO | BERGAMO | 2020-05 | 2020 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... | UCRAINA |
| 1 | F | 61 | NESSUN TITOLO DI STUDIO | NON INDETERMINATO | TEMPO PIENO | LECCO | 2014-09 | 2014 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... | UCRAINA |
| 2 | F | 29 | NESSUN TITOLO DI STUDIO | NON INDETERMINATO | TEMPO PARZIALE ORIZZONTALE | LECCO | 2017-06 | 2017 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... | UCRAINA |
| 3 | F | 48 | NESSUN TITOLO DI STUDIO | NON INDETERMINATO | TEMPO PARZIALE ORIZZONTALE | BRESCIA | 2017-08 | 2017 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... | UCRAINA |
| 4 | F | 54 | NESSUN TITOLO DI STUDIO | NON INDETERMINATO | TEMPO PARZIALE ORIZZONTALE | BRESCIA | 2020-01 | 2020 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... | UCRAINA |
copia.head()
| | GENERE | ETA | TITOLOSTUDIO | CONTRATTO | MODALITALAVORO | PROVINCIAIMPRESA | mese-anno | ANNO | Codice_ateco | SETTOREECONOMICODETTAGLIO_y | ITALIANO | titolostudio_transformed | modalitalavoro_transformed | provincia_transformed | nazionalita_transformed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | F | 60 | NESSUN TITOLO DI STUDIO | NON INDETERMINATO | TEMPO PIENO | BERGAMO | 2020-05 | 2020 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... | UCRAINA | 8 | 4 | 0 | 57 |
| 1 | F | 61 | NESSUN TITOLO DI STUDIO | NON INDETERMINATO | TEMPO PIENO | LECCO | 2014-09 | 2014 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... | UCRAINA | 8 | 4 | 4 | 57 |
| 2 | F | 29 | NESSUN TITOLO DI STUDIO | NON INDETERMINATO | TEMPO PARZIALE ORIZZONTALE | LECCO | 2017-06 | 2017 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... | UCRAINA | 8 | 2 | 4 | 57 |
| 3 | F | 48 | NESSUN TITOLO DI STUDIO | NON INDETERMINATO | TEMPO PARZIALE ORIZZONTALE | BRESCIA | 2017-08 | 2017 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... | UCRAINA | 8 | 2 | 1 | 57 |
| 4 | F | 54 | NESSUN TITOLO DI STUDIO | NON INDETERMINATO | TEMPO PARZIALE ORIZZONTALE | BRESCIA | 2020-01 | 2020 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... | UCRAINA | 8 | 2 | 1 | 57 |
copia.head()
| | GENERE | ETA | TITOLOSTUDIO | CONTRATTO | MODALITALAVORO | PROVINCIAIMPRESA | mese-anno | ANNO | Codice_ateco | SETTOREECONOMICODETTAGLIO_y | ITALIANO | titolostudio_transformed | modalitalavoro_transformed | provincia_transformed | nazionalita_transformed | contratto_transformed | genere_transformed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | F | 60 | NESSUN TITOLO DI STUDIO | NON INDETERMINATO | TEMPO PIENO | BERGAMO | 2020-05 | 2020 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... | UCRAINA | 8 | 4 | 0 | 57 | 0 | 1 |
| 1 | F | 61 | NESSUN TITOLO DI STUDIO | NON INDETERMINATO | TEMPO PIENO | LECCO | 2014-09 | 2014 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... | UCRAINA | 8 | 4 | 4 | 57 | 0 | 1 |
| 2 | F | 29 | NESSUN TITOLO DI STUDIO | NON INDETERMINATO | TEMPO PARZIALE ORIZZONTALE | LECCO | 2017-06 | 2017 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... | UCRAINA | 8 | 2 | 4 | 57 | 0 | 1 |
| 3 | F | 48 | NESSUN TITOLO DI STUDIO | NON INDETERMINATO | TEMPO PARZIALE ORIZZONTALE | BRESCIA | 2017-08 | 2017 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... | UCRAINA | 8 | 2 | 1 | 57 | 0 | 1 |
| 4 | F | 54 | NESSUN TITOLO DI STUDIO | NON INDETERMINATO | TEMPO PARZIALE ORIZZONTALE | BRESCIA | 2020-01 | 2020 | 97 | ATTIVITÀ DI FAMIGLIE E CONVIVENZE COME DATORI ... | UCRAINA | 8 | 2 | 1 | 57 | 0 | 1 |
transformed.head()
| | ETA | mese-anno | ANNO | Codice_ateco | titolostudio_transformed | modalitalavoro_transformed | provincia_transformed | nazionalita_transformed | contratto_transformed | genere_transformed |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 60 | 2020-05 | 2020 | 97 | 8 | 4 | 0 | 57 | 0 | 1 |
| 1 | 61 | 2014-09 | 2014 | 97 | 8 | 4 | 4 | 57 | 0 | 1 |
| 2 | 29 | 2017-06 | 2017 | 97 | 8 | 2 | 4 | 57 | 0 | 1 |
| 3 | 48 | 2017-08 | 2017 | 97 | 8 | 2 | 1 | 57 | 0 | 1 |
| 4 | 54 | 2020-01 | 2020 | 97 | 8 | 2 | 1 | 57 | 0 | 1 |
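The `*_transformed` columns hold integer codes for the categorical columns. The encoding cells are not shown; the sketch below only illustrates two plausible approaches that are consistent with the values in the tables (alphabetical codes for the province, and an explicit mapping for gender, where `F` is 1 and `M` is 0). The five-province list in `toy` is a hypothetical example.

```python
import pandas as pd

# Hypothetical toy data with a small, made-up set of provinces.
toy = pd.DataFrame({
    "PROVINCIAIMPRESA": ["BERGAMO", "LECCO", "BRESCIA", "COMO", "CREMONA"],
    "GENERE": ["F", "F", "M", "F", "M"],
})

# Categories are sorted alphabetically, so codes follow that order:
# BERGAMO=0, BRESCIA=1, COMO=2, CREMONA=3, LECCO=4 (for this toy list).
toy["provincia_transformed"] = (
    toy["PROVINCIAIMPRESA"].astype("category").cat.codes
)

# An explicit dictionary keeps the meaning of each code visible.
toy["genere_transformed"] = toy["GENERE"].map({"M": 0, "F": 1})

print(toy)
```

An explicit mapping is easier to audit than automatic codes when the numbers are later typed by hand into a prediction input.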
# Before balancing
# Check if the dataset is balanced
transformed["contratto_transformed"].value_counts(normalize=True)
0    0.855403
1    0.144597
Name: contratto_transformed, dtype: float64
# After balancing: check that the classes are now balanced
balanced["contratto_transformed"].value_counts(normalize=True)
0    0.5
1    0.5
Name: contratto_transformed, dtype: float64
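The notebook does not show how `balanced` was built. One common way to reach a 50/50 split is random undersampling of the majority class; the sketch below assumes that technique (it may not be the one actually used) and runs on a hypothetical `toy` frame.

```python
import pandas as pd

# Hypothetical imbalanced toy data: 17 rows of class 0, 3 rows of class 1.
toy = pd.DataFrame({
    "ETA": range(20, 40),
    "contratto_transformed": [0] * 17 + [1] * 3,
})

# Size of the smallest class; every class is cut down to this size.
minority_size = toy["contratto_transformed"].value_counts().min()

balanced_toy = (
    toy.groupby("contratto_transformed")
       .sample(n=minority_size, random_state=0)  # same number of rows per class
       .reset_index(drop=True)
)

print(balanced_toy["contratto_transformed"].value_counts(normalize=True))
```

Undersampling discards majority-class rows, so it trades data volume for balance; oversampling the minority class is the usual alternative.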
# Find the max depth necessary for the decision tree
find_best_max_depth(X_train,y_train,X_test,y_test)
# Show Decision tree
plt.figure(figsize=(4,4), dpi=1000)
plot_tree(dct,
          feature_names=["ETA","ANNO","contratto_transformed","nazionalita_transformed","genere_transformed"],
          filled=True)
plt.show()
# Predict the class probabilities
y_pred_prob = dct.predict_proba(X_test)
print(y_pred_prob)
[[0.01872659 0.00374532 0. ... 0.01123596 0.00374532 0. ]
 [0.00722022 0. 0. ... 0.00120337 0. 0. ]
 [0.01372549 0. 0. ... 0.01470588 0.00588235 0. ]
 ...
 [0.00215517 0. 0. ... 0.01508621 0. 0. ]
 [0.00923788 0. 0. ... 0.00923788 0. 0. ]
 [0. 0. 0. ... 0. 0. 0. ]]
# Predict the values
y_pred = dct.predict(X_test)
print(y_pred)
[53 62 85 ... 85 41 41]
# Print the confusion matrix
confusion = metrics.confusion_matrix(y_test, y_pred)
print(confusion)
[[ 943 0 0 ... 16 385 0]
 [ 8 0 0 ... 0 0 0]
 [ 2 0 2 ... 0 1 0]
 ...
 [ 64 2 0 ... 2511 133 1]
 [ 593 0 1 ... 814 4921 21]
 [ 5 0 0 ... 2 21 0]]
# Show the confusion matrix
fig=px.imshow(normalized,color_continuous_scale='blues')
fig.show()
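The `normalized` matrix passed to `px.imshow` is computed in a cell that is not shown. A minimal sketch, assuming it is the confusion matrix normalized by row (so each row gives the share of a true class assigned to each predicted class); `confusion_toy` is a hypothetical example.

```python
import numpy as np

# Hypothetical 3-class confusion matrix; the last class has no true samples.
confusion_toy = np.array([[8, 2, 0],
                          [1, 3, 1],
                          [0, 0, 0]])

# Total number of true samples per class, kept as a column for broadcasting.
row_totals = confusion_toy.sum(axis=1, keepdims=True)

# Divide each row by its total; rows with no samples stay at zero
# instead of producing NaN from a division by zero.
normalized_toy = np.divide(
    confusion_toy, row_totals,
    out=np.zeros(confusion_toy.shape, dtype=float),
    where=row_totals != 0,
)

print(normalized_toy)
```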
# Prediction with our data:
#23 = age
#2021 = year in which you want the contract
#1 = permanent contract (tempo indeterminato)
#27 = Italy
#1 = female
X_mie=[[23,2021,1,27,1]]
# Predict the class probabilities
y_pred_prob_mie = dct.predict_proba(X_mie)
print(y_pred_prob_mie)
y_pred_prob_mie = pd.DataFrame(y_pred_prob_mie)
y_conv=pd.DataFrame(dct.classes_)
app= pd.concat([y_pred_prob_mie,y_conv], axis=1)
df=pd.DataFrame(app)
df= df.dropna()
df=df.loc[:, (df != 0).any(axis=0)]
if (df.iloc[: , -1:].columns == 0):
    df = df.iloc[: , :-1]
df
[[0. 0. 0. 0. 0. 0. 0. 0. 0.01302083 0. 0. 0.00260417 0.00520833 0. 0. 0.00260417 0.00260417 0. 0.0078125 0.00520833 0.01822917 0. 0. 0.03125 0.00260417 0.00520833 0.01302083 0.01041667 0. 0.00520833 0.00260417 0. 0. 0. 0. 0. 0. 0.00520833 0. 0.01041667 0.00260417 0.03125 0.05989583 0.015625 0. 0.00260417 0.02864583 0.02604167 0.0078125 0.0859375 0.00260417 0. 0.00260417 0.00260417 0.03385417 0.0078125 0.015625 0. 0.0078125 0.0078125 0.02083333 0.0078125 0.0078125 0.00260417 0.00260417 0.0078125 0. 0. 0. 0.00260417 0. 0.00260417 0.0078125 0.01822917 0.02083333 0.26041667 0.02604167 0.07552083 0. 0. 0. 0. 0.0078125 0. 0.04427083 0. 0. ]]
| | 8 | 11 | 12 | 15 | 16 | 18 | 19 | 20 | 23 | 24 | ... | 69 | 71 | 72 | 73 | 74 | 75 | 76 | 77 | 82 | 84 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.013021 | 0.002604 | 0.005208 | 0.002604 | 0.002604 | 0.007812 | 0.005208 | 0.018229 | 0.03125 | 0.002604 | ... | 0.002604 | 0.002604 | 0.007812 | 0.018229 | 0.020833 | 0.260417 | 0.026042 | 0.075521 | 0.007812 | 0.044271 |
1 rows × 50 columns
# Sectors in which there is a possibility to get a new contract
sett_pred = []
for x in df.columns:
    el = ateco[ateco["Codice_ateco"] == str(x)]
    sett_pred.extend(el["SETTOREECONOMICODETTAGLIO"])
sett_pred
['INDUSTRIA DELLE BEVANDE', 'INDUSTRIA DEL TABACCO', 'FABBRICAZIONE DI ARTICOLI IN PELLE E SIMILI', 'INDUSTRIA DEL LEGNO E DEI PRODOTTI IN LEGNO E SUGHERO (ESCLUSI I MOBILI); FABBRICAZIONE DI ARTICOLI IN PAGLIA E MATERIALI DA INTRECCIO', 'STAMPA E RIPRODUZIONE DI SUPPORTI REGISTRATI', 'FABBRICAZIONE DI COKE E PRODOTTI DERIVANTI DALLA RAFFINAZIONE DEL PETROLIO', 'FABBRICAZIONE DI PRODOTTI CHIMICI', 'FABBRICAZIONE DI ALTRI PRODOTTI DELLA LAVORAZIONE DI MINERALI NON METALLIFERI', 'METALLURGIA', 'FABBRICAZIONE DI PRODOTTI IN METALLO (ESCLUSI MACCHINARI E ATTREZZATURE)', 'FABBRICAZIONE DI COMPUTER E PRODOTTI DI ELETTRONICA E OTTICA; APPARECCHI ELETTROMEDICALI, APPARECCHI DI MISURAZIONE E DI OROLOGI', 'FABBRICAZIONE DI APPARECCHIATURE ELETTRICHE ED APPARECCHIATURE PER USO DOMESTICO NON ELETTRICHE', 'FABBRICAZIONE DI AUTOVEICOLI, RIMORCHI E SEMIRIMORCHI', 'FABBRICAZIONE DI ALTRI MEZZI DI TRASPORTO', 'GESTIONE DELLE RETI FOGNARIE', 'ATTIVITÀ DI RISANAMENTO E ALTRI SERVIZI DI GESTIONE DEI RIFIUTI', 'COSTRUZIONE DI EDIFICI', 'INGEGNERIA CIVILE', 'LAVORI DI COSTRUZIONE SPECIALIZZATI', "COMMERCIO ALL'INGROSSO E AL DETTAGLIO E RIPARAZIONE DI AUTOVEICOLI E MOTOCICLI", "COMMERCIO ALL'INGROSSO (ESCLUSO QUELLO DI AUTOVEICOLI E DI MOTOCICLI)", 'COMMERCIO AL DETTAGLIO (ESCLUSO QUELLO DI AUTOVEICOLI E DI MOTOCICLI)', 'TRASPORTO TERRESTRE E TRASPORTO MEDIANTE CONDOTTE', "TRASPORTO MARITTIMO E PER VIE D'ACQUA", 'MAGAZZINAGGIO E ATTIVITÀ DI SUPPORTO AI TRASPORTI', 'SERVIZI POSTALI E ATTIVITÀ DI CORRIERE', 'ALLOGGIO', 'ATTIVITÀ DEI SERVIZI DI RISTORAZIONE', 'ATTIVITÀ EDITORIALI', 'ATTIVITÀ DI PRODUZIONE CINEMATOGRAFICA, DI VIDEO E DI PROGRAMMI TELEVISIVI, DI REGISTRAZIONI MUSICALI E SONORE', 'ATTIVITÀ DI PROGRAMMAZIONE E TRASMISSIONE', 'TELECOMUNICAZIONI', 'PRODUZIONE DI SOFTWARE, CONSULENZA INFORMATICA E ATTIVITÀ CONNESSE', "ATTIVITÀ DEI SERVIZI D'INFORMAZIONE E ALTRI SERVIZI INFORMATICI", 'ATTIVITÀ DI SERVIZI FINANZIARI (ESCLUSE LE ASSICURAZIONI E I FONDI PENSIONE)', 'ASSICURAZIONI, RIASSICURAZIONI E FONDI PENSIONE (ESCLUSE LE ASSICURAZIONI SOCIALI OBBLIGATORIE)', 'ATTIVITÀ LEGALI E CONTABILITÀ', "ATTIVITÀ DEGLI STUDI DI ARCHITETTURA E D'INGEGNERIA; COLLAUDI ED ANALISI TECNICHE", 'RICERCA SCIENTIFICA E SVILUPPO', 'PUBBLICITÀ E RICERCHE DI MERCATO', 'ALTRE ATTIVITÀ PROFESSIONALI, SCIENTIFICHE E TECNICHE', 'SERVIZI VETERINARI', 'ATTIVITÀ DI NOLEGGIO E LEASING OPERATIVO', "ATTIVITÀ DI SUPPORTO PER LE FUNZIONI D'UFFICIO E ALTRI SERVIZI DI SUPPORTO ALLE IMPRESE", 'AMMINISTRAZIONE PUBBLICA E DIFESA; ASSICURAZIONE SOCIALE OBBLIGATORIA']
len(sett_pred)
45
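In scikit-learn, column `i` of `predict_proba` corresponds to the label `classes_[i]`, not to the integer `i` itself. If the target of `dct` holds ATECO codes (an assumption, since the cell that builds the target is not shown), it may be safer to label the probabilities with `dct.classes_` before looking up sector names. A minimal sketch on hypothetical toy data:

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: one feature, target labels that look like ATECO codes.
X_toy = np.array([[20], [21], [40], [41], [60], [61]])
y_toy = np.array([11, 11, 56, 56, 97, 97])

clf = DecisionTreeClassifier(random_state=0).fit(X_toy, y_toy)

# Probabilities for one new sample; the columns follow clf.classes_.
proba = clf.predict_proba([[40]])[0]

# Index the probabilities by the actual class labels, keep the non-zero ones.
by_code = pd.Series(proba, index=clf.classes_)
possible = by_code[by_code > 0].sort_values(ascending=False)

print(clf.classes_)  # labels in column order
print(possible)      # label -> probability
```

With this layout, the index of `possible` can be matched against `Codice_ateco` directly, without relying on column positions.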
# Prediction with our data:
#23 = age
#2021 = year in which you want the contract
#1 = permanent contract (tempo indeterminato)
#27 = Italy
#0 = male
X_male=[[23,2021,1,27,0]]
# Predict the class probabilities
y_pred_prob_male = dct.predict_proba(X_male)
print(y_pred_prob_male)
y_pred_prob_male = pd.DataFrame(y_pred_prob_male)
y_conv=pd.DataFrame(dct.classes_)
app= pd.concat([y_pred_prob_male,y_conv], axis=1)
df=pd.DataFrame(app)
df= df.dropna()
df=df.loc[:, (df != 0).any(axis=0)]
if (df.iloc[: , -1:].columns == 0):
    df = df.iloc[: , :-1]
df
[[0.00568182 0. 0. 0. 0. 0. 0. 0. 0.01325758 0. 0. 0.00378788 0.00189394 0. 0.00189394 0.00568182 0.00378788 0. 0.02083333 0.00378788 0.03598485 0.0094697 0.00757576 0.06818182 0.00568182 0.01515152 0.07386364 0.01136364 0.00568182 0.00378788 0.00568182 0.0094697 0.00189394 0.00378788 0. 0.0094697 0. 0.01704545 0.00568182 0.04356061 0.01515152 0.05681818 0.02651515 0.02840909 0. 0.00189394 0.02840909 0.09280303 0.00189394 0.05681818 0.00378788 0.00189394 0. 0. 0.11363636 0.00568182 0.01325758 0. 0.0094697 0.00189394 0.00378788 0.00568182 0.00568182 0.00378788 0.00568182 0.00378788 0. 0.00568182 0.00189394 0. 0.00378788 0.01515152 0.01325758 0.02272727 0. 0.03030303 0.00189394 0.00757576 0.00189394 0. 0. 0.00378788 0.00189394 0.00189394 0.00568182 0.00189394 0. ]]
| | 0 | 8 | 11 | 12 | 14 | 15 | 16 | 18 | 19 | 20 | ... | 73 | 75 | 76 | 77 | 78 | 81 | 82 | 83 | 84 | 85 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.005682 | 0.013258 | 0.003788 | 0.001894 | 0.001894 | 0.005682 | 0.003788 | 0.020833 | 0.003788 | 0.035985 | ... | 0.022727 | 0.030303 | 0.001894 | 0.007576 | 0.001894 | 0.003788 | 0.001894 | 0.001894 | 0.005682 | 0.001894 |
1 rows × 64 columns
# Sectors in which there is a possibility to get a new contract
sett_pred = []
for x in df.columns:
    el = ateco[ateco["Codice_ateco"] == str(x)]
    sett_pred.extend(el["SETTOREECONOMICODETTAGLIO"])
sett_pred
['INDUSTRIA DELLE BEVANDE', 'INDUSTRIA DEL TABACCO', 'CONFEZIONE DI ARTICOLI DI ABBIGLIAMENTO; CONFEZIONE DI ARTICOLI IN PELLE E PELLICCIA', 'FABBRICAZIONE DI ARTICOLI IN PELLE E SIMILI', 'INDUSTRIA DEL LEGNO E DEI PRODOTTI IN LEGNO E SUGHERO (ESCLUSI I MOBILI); FABBRICAZIONE DI ARTICOLI IN PAGLIA E MATERIALI DA INTRECCIO', 'STAMPA E RIPRODUZIONE DI SUPPORTI REGISTRATI', 'FABBRICAZIONE DI COKE E PRODOTTI DERIVANTI DALLA RAFFINAZIONE DEL PETROLIO', 'FABBRICAZIONE DI PRODOTTI CHIMICI', 'FABBRICAZIONE DI PRODOTTI FARMACEUTICI DI BASE E DI PREPARATI FARMACEUTICI', 'FABBRICAZIONE DI ARTICOLI IN GOMMA E MATERIE PLASTICHE', 'FABBRICAZIONE DI ALTRI PRODOTTI DELLA LAVORAZIONE DI MINERALI NON METALLIFERI', 'METALLURGIA', 'FABBRICAZIONE DI PRODOTTI IN METALLO (ESCLUSI MACCHINARI E ATTREZZATURE)', 'FABBRICAZIONE DI COMPUTER E PRODOTTI DI ELETTRONICA E OTTICA; APPARECCHI ELETTROMEDICALI, APPARECCHI DI MISURAZIONE E DI OROLOGI', 'FABBRICAZIONE DI APPARECCHIATURE ELETTRICHE ED APPARECCHIATURE PER USO DOMESTICO NON ELETTRICHE', 'FABBRICAZIONE DI MACCHINARI ED APPARECCHIATURE NCA', 'FABBRICAZIONE DI AUTOVEICOLI, RIMORCHI E SEMIRIMORCHI', 'FABBRICAZIONE DI ALTRI MEZZI DI TRASPORTO', 'FABBRICAZIONE DI MOBILI', 'ALTRE INDUSTRIE MANIFATTURIERE', 'RIPARAZIONE, MANUTENZIONE ED INSTALLAZIONE DI MACCHINE ED APPARECCHIATURE', 'FORNITURA DI ENERGIA ELETTRICA, GAS, VAPORE E ARIA CONDIZIONATA', 'GESTIONE DELLE RETI FOGNARIE', 'ATTIVITÀ DI RACCOLTA, TRATTAMENTO E SMALTIMENTO DEI RIFIUTI; RECUPERO DEI MATERIALI', 'ATTIVITÀ DI RISANAMENTO E ALTRI SERVIZI DI GESTIONE DEI RIFIUTI', 'COSTRUZIONE DI EDIFICI', 'INGEGNERIA CIVILE', 'LAVORI DI COSTRUZIONE SPECIALIZZATI', "COMMERCIO ALL'INGROSSO E AL DETTAGLIO E RIPARAZIONE DI AUTOVEICOLI E MOTOCICLI", "COMMERCIO ALL'INGROSSO (ESCLUSO QUELLO DI AUTOVEICOLI E DI MOTOCICLI)", 'COMMERCIO AL DETTAGLIO (ESCLUSO QUELLO DI AUTOVEICOLI E DI MOTOCICLI)', 'TRASPORTO TERRESTRE E TRASPORTO MEDIANTE CONDOTTE', "TRASPORTO MARITTIMO E PER VIE D'ACQUA", 'TRASPORTO AEREO', 'ALLOGGIO', 'ATTIVITÀ DEI SERVIZI DI RISTORAZIONE', 'ATTIVITÀ EDITORIALI', 'ATTIVITÀ DI PRODUZIONE CINEMATOGRAFICA, DI VIDEO E DI PROGRAMMI TELEVISIVI, DI REGISTRAZIONI MUSICALI E SONORE', 'ATTIVITÀ DI PROGRAMMAZIONE E TRASMISSIONE', 'TELECOMUNICAZIONI', 'PRODUZIONE DI SOFTWARE, CONSULENZA INFORMATICA E ATTIVITÀ CONNESSE', "ATTIVITÀ DEI SERVIZI D'INFORMAZIONE E ALTRI SERVIZI INFORMATICI", 'ATTIVITÀ DI SERVIZI FINANZIARI (ESCLUSE LE ASSICURAZIONI E I FONDI PENSIONE)', 'ASSICURAZIONI, RIASSICURAZIONI E FONDI PENSIONE (ESCLUSE LE ASSICURAZIONI SOCIALI OBBLIGATORIE)', 'ATTIVITÀ IMMOBILIARI', 'ATTIVITÀ DI DIREZIONE AZIENDALE E DI CONSULENZA GESTIONALE ', "ATTIVITÀ DEGLI STUDI DI ARCHITETTURA E D'INGEGNERIA; COLLAUDI ED ANALISI TECNICHE", 'RICERCA SCIENTIFICA E SVILUPPO', 'PUBBLICITÀ E RICERCHE DI MERCATO', 'SERVIZI VETERINARI', 'ATTIVITÀ DI NOLEGGIO E LEASING OPERATIVO', 'ATTIVITÀ DI RICERCA, SELEZIONE, FORNITURA DI PERSONALE ', 'ATTIVITÀ DI SERVIZI PER EDIFICI E PAESAGGIO', "ATTIVITÀ DI SUPPORTO PER LE FUNZIONI D'UFFICIO E ALTRI SERVIZI DI SUPPORTO ALLE IMPRESE", 'AMMINISTRAZIONE PUBBLICA E DIFESA; ASSICURAZIONE SOCIALE OBBLIGATORIA', 'ISTRUZIONE']
len(sett_pred)
56
plt.figure(figsize=(4,4), dpi=1000)
plot_tree(best_max_depth_tree,
          feature_names=["ETA","ANNO","contratto_transformed","nazionalita_transformed","genere_transformed"],
          filled=True)
plt.show()
# Show the best parameter
print(max_depth_grid_search.best_params_)
{'max_depth': 12}
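The object `max_depth_grid_search` is created in a cell that is not shown. Its `best_params_` attribute suggests a scikit-learn `GridSearchCV`; the sketch below assumes that, and uses a synthetic dataset, a depth range of 1 to 15 and 5-fold cross-validation, all of which are illustrative choices rather than the notebook's settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real training data (hypothetical).
X_toy, y_toy = make_classification(
    n_samples=300, n_features=5, n_informative=3, random_state=0
)

# Try every max_depth in the grid and score each with cross-validation.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": list(range(1, 16))},
    cv=5,
    scoring="accuracy",
)
search.fit(X_toy, y_toy)

print(search.best_params_)            # e.g. {'max_depth': ...}
best_tree = search.best_estimator_    # tree refitted with the best depth
```

Cross-validation selects the depth on held-out folds of the training data, which avoids tuning the hyperparameter against the test set.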
# Print the confusion matrix
confusion = metrics.confusion_matrix(y_train, y_pred)
print(confusion)
[[ 5320 24 8 ... 35 909 0]
 [ 62 15 2 ... 0 5 0]
 [ 12 0 8 ... 0 2 0]
 ...
 [ 667 1 5 ... 3503 926 0]
 [ 2070 9 1 ... 344 50456 0]
 [ 31 0 0 ... 3 54 0]]
# Show the confusion matrix
fig=px.imshow(normalized,color_continuous_scale='blues')
fig.show()
# Plot the number of contracts for each month (mese-anno)
f, ax1 = plt.subplots(1,1,figsize=(15,5))
bal2.plot(ax=ax1)
ax1.set_xlabel("time")
ax1.set_ylabel("Number of Contracts")
plt.grid(True)
print('ADF Statistic: %f' % results[0])
print('p-value: %f' % results[1])
ADF Statistic: -4.613666
p-value: 0.000122
fig, ax = plt.subplots(4, 1, figsize=(15, 6))
decomposed_add.observed.plot(ax = ax[0])
decomposed_add.trend.plot(ax = ax[1])
decomposed_add.seasonal.plot(ax = ax[2])
decomposed_add.resid.plot(ax = ax[3])
ax[0].set_ylabel('')
ax[1].set_ylabel('Trend')
ax[2].set_ylabel('Seasonal')
ax[3].set_ylabel('Residual')
plt.tight_layout()
plt.show()
plt.figure(figsize=(12,5))
ax1 = bal_diff.plot()
ax1.set_xlabel("Anno")
ax1.set_ylabel("diff")
plt.grid(True)
plt.show()
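The series `bal_diff` is built in a cell that is not shown. A minimal sketch, assuming it is the first difference of the monthly contract counts (each value minus the previous month's value); `monthly_toy` is hypothetical data.

```python
import pandas as pd

# Hypothetical monthly contract counts indexed by year-month labels.
monthly_toy = pd.Series(
    [100, 120, 90, 130, 125],
    index=["2019-01", "2019-02", "2019-03", "2019-04", "2019-05"],
)

# First difference: change from one month to the next.
# The first element has no predecessor, so it is NaN and is dropped.
diff_toy = monthly_toy.diff().dropna()

print(diff_toy)
```

Differencing removes a slowly changing level or trend, which is why it is often applied before fitting ARMA-type models.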
print('ADF Statistic: %f' % results[0])
print('p-value: %f' % results[1])
ADF Statistic: -2.987232
p-value: 0.036104
print('ADF Statistic: %f' % results[0])
print("P-value of a test is: {}".format(results[1]))
ADF Statistic: -7.801078
P-value of a test is: 7.490497084599864e-12
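The null hypothesis of the augmented Dickey-Fuller test is that the series has a unit root, i.e. is non-stationary; a small p-value is evidence against it. The sketch below applies that decision rule to the three results printed above. The 5% threshold and the labels `first`, `second`, `third` are assumptions, since the notebook does not state the significance level or which series each test refers to.

```python
# (ADF statistic, p-value) pairs copied from the three outputs above.
adf_results = {
    "first": (-4.613666, 0.000122),
    "second": (-2.987232, 0.036104),
    "third": (-7.801078, 7.490497084599864e-12),
}

ALPHA = 0.05  # assumed significance level


def rejects_unit_root(p_value, alpha=ALPHA):
    """Return True when the unit-root null is rejected at level alpha."""
    return p_value < alpha


decisions = {}
for label, (statistic, p_value) in adf_results.items():
    decisions[label] = rejects_unit_root(p_value)
    verdict = "stationary" if decisions[label] else "not shown to be stationary"
    print(f"{label}: ADF={statistic:.3f}, p={p_value:.3g} -> {verdict}")
```

At a stricter 1% level the second result (p = 0.036104) would no longer reject the null, so the conclusion for that series depends on the chosen threshold.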
# Show autocorrelation and partial autocorrelation
fig,ax = plt.subplots(2,1,figsize=(20,10))
plot_acf(bal_diff, lags=4, ax=ax[0])
plot_pacf(bal_diff, lags=4, ax=ax[1])
plt.show()
arima_df = arima_aic(bal2)
arima_df
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 1 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.53563D+00 |proj g|= 0.00000D+00
* * *
Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value
* * *
N Tit Tnf Tnint Skip Nact Projg F
1 0 1 0 0 0 0.000D+00 9.536D+00
F = 9.5356311497057664
CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 2 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.41030D+00 |proj g|= 1.82112D-03
* * *
Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value
* * *
N Tit Tnf Tnint Skip Nact Projg F
2 4 7 1 0 0 1.776D-07 9.410D+00
F = 9.4102933631734142
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 3 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.39647D+00 |proj g|= 1.64544D-03
At iterate 5 f= 9.39646D+00 |proj g|= 3.55272D-07
At iterate 10 f= 9.39646D+00 |proj g|= 8.17124D-06
At iterate 15 f= 9.39646D+00 |proj g|= 2.62901D-05
At iterate 20 f= 9.39646D+00 |proj g|= 2.54019D-05
* * *
Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value
* * *
N Tit Tnf Tnint Skip Nact Projg F
3 24 36 1 0 0 1.776D-07 9.396D+00
F = 9.3964614257279546
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 4 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.38227D+00 |proj g|= 1.75220D-03
At iterate 5 f= 9.38225D+00 |proj g|= 5.32907D-07
At iterate 10 f= 9.38225D+00 |proj g|= 2.38032D-05
At iterate 15 f= 9.38225D+00 |proj g|= 3.30402D-05
At iterate 20 f= 9.38225D+00 |proj g|= 6.18172D-05
This problem is unconstrained.
At iterate 25 f= 9.38225D+00 |proj g|= 0.00000D+00
* * *
Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value
* * *
N Tit Tnf Tnint Skip Nact Projg F
4 25 34 1 0 0 0.000D+00 9.382D+00
F = 9.3822507141105049
CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 2 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.49349D+00 |proj g|= 8.28138D-04
* * *
Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value
* * *
N Tit Tnf Tnint Skip Nact Projg F
2 2 4 1 0 0 0.000D+00 9.493D+00
F = 9.4934924944244568
CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 3 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.40371D+00 |proj g|= 1.77138D-03
At iterate 5 f= 9.40370D+00 |proj g|= 7.10543D-07
At iterate 10 f= 9.40370D+00 |proj g|= 1.68754D-05
At iterate 15 f= 9.40370D+00 |proj g|= 8.17124D-06
At iterate 20 f= 9.40370D+00 |proj g|= 5.64881D-05
At iterate 25 f= 9.40370D+00 |proj g|= 1.77636D-07
* * *
Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value
* * *
N Tit Tnf Tnint Skip Nact Projg F
3 25 40 1 0 0 1.776D-07 9.404D+00
F = 9.4036971479741140
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 4 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.39168D+00 |proj g|= 1.52287D-03
At iterate 5 f= 9.39167D+00 |proj g|= 2.48690D-06
* * *
Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value
* * *
N Tit Tnf Tnint Skip Nact Projg F
4 8 12 1 0 0 0.000D+00 9.392D+00
F = 9.3916740930360216
CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 5 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.36722D+00 |proj g|= 1.61098D-03
At iterate 5 f= 9.36719D+00 |proj g|= 5.55289D-04
At iterate 10 f= 9.36719D+00 |proj g|= 7.10543D-06
At iterate 15 f= 9.36719D+00 |proj g|= 1.41398D-04
At iterate 20 f= 9.36715D+00 |proj g|= 1.10987D-03
At iterate 25 f= 9.36712D+00 |proj g|= 3.16192D-05
At iterate 30 f= 9.36712D+00 |proj g|= 3.55271D-07
* * *
Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value
* * *
N Tit Tnf Tnint Skip Nact Projg F
5 32 43 1 0 0 1.776D-07 9.367D+00
F = 9.3671181930709899
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 3 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.39460D+00 |proj g|= 2.46381D-03
* * *
Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value
* * *
N Tit Tnf Tnint Skip Nact Projg F
3 4 6 1 0 0 0.000D+00 9.395D+00
F = 9.3945834705319768
CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 4 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.37144D+00 |proj g|= 2.43237D-03
At iterate 5 f= 9.37142D+00 |proj g|= 7.10543D-06
At iterate 10 f= 9.37142D+00 |proj g|= 5.32907D-07
At iterate 15 f= 9.37142D+00 |proj g|= 5.86198D-06
At iterate 20 f= 9.37142D+00 |proj g|= 3.73035D-06
At iterate 25 f= 9.37142D+00 |proj g|= 4.08562D-06
At iterate 30 f= 9.37142D+00 |proj g|= 1.33227D-05
At iterate 35 f= 9.37142D+00 |proj g|= 3.49942D-05
At iterate 40 f= 9.37142D+00 |proj g|= 5.32907D-07
* * *
Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value
* * *
N Tit Tnf Tnint Skip Nact Projg F
4 41 58 1 0 0 3.553D-07 9.371D+00
F = 9.3714200543317165
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 5 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.37136D+00 |proj g|= 2.41887D-03
At iterate 5 f= 9.37134D+00 |proj g|= 1.66800D-04
At iterate 10 f= 9.37134D+00 |proj g|= 2.30926D-06
At iterate 15 f= 9.37134D+00 |proj g|= 2.13163D-06
At iterate 20 f= 9.37134D+00 |proj g|= 3.67706D-05
At iterate 25 f= 9.37134D+00 |proj g|= 1.27898D-05
At iterate 30 f= 9.37134D+00 |proj g|= 3.55271D-07
* * *
Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value
* * *
N Tit Tnf Tnint Skip Nact Projg F
5 30 40 1 0 0 3.553D-07 9.371D+00
F = 9.3713416165056049
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 4 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.36681D+00 |proj g|= 2.47837D-03
At iterate 5 f= 9.36679D+00 |proj g|= 8.88179D-07
At iterate 10 f= 9.36679D+00 |proj g|= 2.59348D-05
At iterate 15 f= 9.36679D+00 |proj g|= 9.36140D-05
At iterate 20 f= 9.36679D+00 |proj g|= 1.77636D-06
* * *
Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value
* * *
N Tit Tnf Tnint Skip Nact Projg F
4 23 39 1 0 0 7.105D-07 9.367D+00
F = 9.3667912594171181
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 5 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.35674D+00 |proj g|= 3.35003D-03
At iterate 5 f= 9.35671D+00 |proj g|= 3.32179D-05
At iterate 10 f= 9.35671D+00 |proj g|= 1.77636D-07
* * *
N Tit Tnf Tnint Skip Nact Projg F
5 11 14 1 0 0 0.000D+00 9.357D+00
F = 9.3567121127836508
CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 7 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.33916D+00 |proj g|= 4.07532D-03
At iterate 5 f= 9.33911D+00 |proj g|= 7.81419D-04
At iterate 10 f= 9.33910D+00 |proj g|= 1.47971D-04
At iterate 15 f= 9.33910D+00 |proj g|= 5.50671D-06
At iterate 20 f= 9.33910D+00 |proj g|= 2.48690D-06
At iterate 25 f= 9.33910D+00 |proj g|= 3.19744D-05
At iterate 30 f= 9.33910D+00 |proj g|= 1.11378D-04
At iterate 35 f= 9.33910D+00 |proj g|= 3.01981D-06
* * *
N Tit Tnf Tnint Skip Nact Projg F
7 37 48 1 0 0 1.776D-06 9.339D+00
F = 9.3391012752206422
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
This problem is unconstrained.
| | p | q | aic | bic | sum_aic_bic |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 3284.257115 | 3290.552104 | 6574.80922 |
| 4 | 1 | 0 | 3271.761418 | 3281.203902 | 6552.96532 |
| 1 | 0 | 1 | 3243.140917 | 3252.5834 | 6495.724317 |
| 5 | 1 | 1 | 3242.871819 | 3255.461797 | 6498.333616 |
| 6 | 1 | 2 | 3240.735888 | 3256.47336 | 6497.209248 |
| 2 | 0 | 2 | 3240.38273 | 3252.972708 | 6493.355439 |
| 8 | 2 | 0 | 3239.736714 | 3252.326692 | 6492.063406 |
| 3 | 0 | 3 | 3237.494246 | 3253.231718 | 6490.725964 |
| 10 | 2 | 2 | 3235.741516 | 3254.626483 | 6490.367999 |
| 7 | 1 | 3 | 3234.288658 | 3253.173625 | 6487.462284 |
| 9 | 2 | 1 | 3233.768499 | 3249.505971 | 6483.27447 |
| 12 | 3 | 0 | 3232.176193 | 3247.913666 | 6480.089859 |
| 13 | 3 | 1 | 3230.708967 | 3249.593934 | 6480.3029 |
| 15 | 3 | 3 | 3228.650839 | 3253.830794 | 6482.481633 |
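The grid above can be ranked by any of its criteria. The following is a minimal sketch, assuming each candidate is a `(p, q, aic, bic)` tuple copied from a few rows of the table; `rank_orders` is a hypothetical helper for illustration and is not the notebook's `arima_aic` function.

```python
# A few (p, q, aic, bic) rows copied from the table above.
rows = [
    (0, 0, 3284.257115, 3290.552104),
    (2, 1, 3233.768499, 3249.505971),
    (3, 0, 3232.176193, 3247.913666),
    (3, 1, 3230.708967, 3249.593934),
    (3, 3, 3228.650839, 3253.830794),
]


def rank_orders(rows, key="aic"):
    """Return the rows sorted from best (lowest score) to worst."""
    score = {
        "aic": lambda r: r[2],
        "bic": lambda r: r[3],
        "sum": lambda r: r[2] + r[3],
    }[key]
    return sorted(rows, key=score)


# Keep only the (p, q) pair of the best candidate for each criterion.
best_aic = rank_orders(rows, "aic")[0][:2]
best_bic = rank_orders(rows, "bic")[0][:2]
best_sum = rank_orders(rows, "sum")[0][:2]
print(best_aic, best_bic, best_sum)
```

On these rows AIC prefers the larger (3, 3) model, while BIC, which penalises extra parameters more heavily, prefers (3, 0).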
print(results.summary())
SARIMAX Results
==========================================================================================
Dep. Variable: count No. Observations: 173
Model: SARIMAX(3, 1, 3)x(0, 1, [], 6) Log Likelihood -1586.377
Date: Wed, 01 Jun 2022 AIC 3186.755
Time: 11:04:59 BIC 3208.539
Sample: 0 HQIC 3195.597
- 173
Covariance Type: opg
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
ar.L1 -0.6172 0.135 -4.588 0.000 -0.881 -0.354
ar.L2 -0.8422 0.104 -8.060 0.000 -1.047 -0.637
ar.L3 -0.1124 0.127 -0.883 0.377 -0.362 0.137
ma.L1 0.1486 0.087 1.699 0.089 -0.023 0.320
ma.L2 0.2004 0.079 2.536 0.011 0.046 0.355
ma.L3 -0.7823 0.081 -9.642 0.000 -0.941 -0.623
sigma2 1.429e+07 1.69e-09 8.46e+15 0.000 1.43e+07 1.43e+07
===================================================================================
Ljung-Box (L1) (Q): 0.06 Jarque-Bera (JB): 2.06
Prob(Q): 0.81 Prob(JB): 0.36
Heteroskedasticity (H): 0.82 Skew: 0.11
Prob(H) (two-sided): 0.47 Kurtosis: 3.50
===================================================================================
Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
[2] Covariance matrix is singular or near-singular, with condition number 3.64e+32. Standard errors may be unstable.
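The AIC and BIC reported in the summary above can be checked by hand from the log-likelihood. This is a rough sketch, not statsmodels' own code: it assumes 7 estimated parameters (the six AR/MA coefficients plus `sigma2`) and an effective sample of 173 - 1 - 6 = 166 observations after the regular and seasonal differencing. That effective sample size is an assumption that happens to reproduce the reported BIC.

```python
import math

log_likelihood = -1586.377  # "Log Likelihood" from the summary
k = 7                       # 3 AR + 3 MA coefficients + sigma2
n_obs = 173                 # "No. Observations" from the summary

# Assumption: d=1 and D=1 with period 6 remove 1 + 6 observations
# from the sample used in the BIC penalty.
n_eff = n_obs - 1 - 6

aic = 2 * k - 2 * log_likelihood
bic = k * math.log(n_eff) - 2 * log_likelihood
print(round(aic, 3), round(bic, 3))
```

Both values agree with the summary to within the rounding of the printed log-likelihood.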
plot = results.plot_diagnostics()
# Plot mean SARIMA predictions
fig,ax = plt.subplots(1,1,figsize=(20,8))
plt.plot(bal2, label='original')
plt.plot(forecast.predicted_mean, label='SARIMAX', c="r")
plt.xticks(bal2.index.unique())
plt.locator_params(axis='x', nbins=10)
plt.xlabel('time')
plt.ylabel('Number of contracts')
plt.legend()
plt.grid(True)
plt.show()
# Plot train and test sets
plt.subplots(1,1,figsize=(20,7))
plt.plot(contract_train['count'],label='TRAIN (80%)')
plt.plot(contract_test['count'], label='TEST(20%)')
plt.legend(loc='best')
plt.xlabel('date')
plt.ylabel('count')
plt.title('Train and Test set')
plt.xticks(bal2.index.unique())
plt.locator_params(axis='x', nbins=10)
plt.grid(True)
plt.show()
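The code that builds `contract_train` and `contract_test` is not shown in this part of the notebook. A minimal sketch of an 80/20 chronological split, assuming the observations are already sorted by date and using a plain list in place of the DataFrame (`chronological_split` is a hypothetical helper):

```python
def chronological_split(series, train_fraction=0.8):
    """Split an ordered series without shuffling, so the test set is the most recent part."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


# Illustrative values only, not taken from the dataset.
counts = list(range(100, 110))
train, test = chronological_split(counts)
print(len(train), len(test))
```

Keeping the order intact matters for time series: a random split would let the model see observations from the period it is later asked to forecast.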
# Plot the number of contracts forecast with Holt's method (Double Exponential Smoothing)
fig = plt.figure(figsize=(14,5))
plt.plot(contract_train.index, contract_train['count'], label='Train')
plt.plot(contract_test.index, contract_test['count'], label='Test')
plt.plot(des_errors_df.index, des_errors_df['Predicted_Count'], label='Forecast')
plt.legend(loc='best')
plt.xlabel('date')
plt.ylabel('count')
plt.title('Forecast using Holt Winters-Double Exponential Smoothing')
plt.xticks(bal2.index.unique())
plt.locator_params(axis='x', nbins=10)
plt.grid(True)
plt.show()
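The fitting code for the double exponential smoothing model is not shown here, and the notebook most likely relies on a library implementation. As a hand-rolled illustration of the idea, the sketch below tracks a level and a trend and extrapolates them; `holt_forecast`, `alpha` and `beta` are hypothetical names, and the initialisation (first value as level, first difference as trend) is one common choice among several.

```python
def holt_forecast(y, alpha, beta, horizon):
    """Holt's linear (double exponential smoothing) forecast for `horizon` steps ahead."""
    if len(y) < 2:
        raise ValueError("need at least two observations")
    level = y[0]
    trend = y[1] - y[0]
    for value in y[1:]:
        previous_level = level
        # Blend the new observation with the previous one-step forecast.
        level = alpha * value + (1 - alpha) * (previous_level + trend)
        # Blend the latest change in level with the previous trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + (h + 1) * trend for h in range(horizon)]


# Illustrative series with a constant slope of 2.
forecast_values = holt_forecast([10, 12, 14, 16], alpha=0.5, beta=0.5, horizon=2)
print(forecast_values)
```

On a perfectly linear series the method continues the line, which is why it suits trending data without seasonality; the triple (Holt-Winters) variant adds a seasonal component on top.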
# Plot the number of contracts forecast with the Holt-Winters method (Triple Exponential Smoothing)
fig= plt.figure(figsize=(14,5))
plt.plot(contract_train.index, contract_train['count'], label='Train')
plt.plot(contract_test.index, contract_test['count'], label='Test')
plt.plot(tes_errors_df.index, tes_errors_df['Predicted_Count'], label='Forecast')
plt.legend(loc='best')
plt.xlabel('date')
plt.ylabel('count')
plt.title('Forecast using Holt Winters-Triple Exponential Smoothing')
plt.xticks(bal2.index.unique())
plt.locator_params(axis='x', nbins=10)
plt.grid(True)
plt.show()
# Check the score for the train and test sets
print('Model Score at Train set: {:.2%}'.format(etr_model.score(X_train, y_train)))
print('Model Score at Test set: {:.2%}'.format(etr_model.score(X_test, y_test)))
Model Score at Train set: 100.00%
Model Score at Test set: 72.51%
# Show the predictions for Extra Tree Regressor
fig = plt.figure(figsize=(14,5))
plt.plot(contract_train.index, contract_train['count'], label='Train')
plt.plot(contract_test.index, contract_test['count'], label='Test')
plt.plot(etr_errors_df.index, etr_errors_df['Predicted_Count'], label='Forecast - ExtraTreesRegressor')
plt.legend(loc='best')
plt.xlabel('Date')
plt.ylabel('count')
plt.title('Forecast using ExtraTreesRegressor model')
plt.xticks(bal2.index.unique())
plt.locator_params(axis='x', nbins=10)
plt.grid(True)
plt.show()
# Show the predictions for Linear Regression
fig = plt.figure(figsize=(14,5))
plt.plot(contract_train.index, contract_train['count'], label='Train')
plt.plot(contract_test.index, contract_test['count'], label='Test')
plt.plot(lr_errors_df.index, lr_errors_df['Predicted_Count'], label='Forecast - Linear Regression')
plt.legend(loc='best')
plt.xlabel('Date')
plt.ylabel('count')
plt.title('Forecast using Linear Regression')
plt.xticks(bal2.index.unique())
plt.locator_params(axis='x', nbins=10)
plt.grid(True)
plt.show()
plot
# Show the predictions for SARIMA model
fig = plt.figure(figsize=(14,5))
plt.plot(contract_train.index, contract_train['count'], label='Train')
plt.plot(contract_test.index, contract_test['count'], label='Test')
plt.plot(sarima_test_df.index, sarima_test_df['Predicted_Count'], label='Forecast - SARIMA')
plt.legend(loc='best')
plt.xlabel('Date')
plt.ylabel('count')
plt.title('Forecast using SARIMA')
plt.xticks(bal2.index.unique())
plt.locator_params(axis='x', nbins=10)
plt.grid(True)
plt.show()
# Show the errors for the predicted and the actual values
plt.figure(figsize=(14,5))
plt.plot(sarima_test_df.index, np.abs(sarima_test_df['Error']), label='errors')
plt.plot(sarima_test_df.index, sarima_test_df['count'], label='Actual Count')
plt.plot(sarima_test_df.index, sarima_test_df['Predicted_Count'], label='Predicted Count')
plt.legend(loc='best')
plt.xlabel('Date')
plt.ylabel('Count')
plt.title('Seasonal ARIMA (SARIMA) forecasts with actual count vs errors')
plt.xticks(sarima_test_df.index.unique())
plt.locator_params(axis='x', nbins=10)
plt.show()
# Show predictions for Support Vector Regressor
fig = plt.figure(figsize=(14,5))
plt.plot(contract_train.index, contract_train['count'], label='Train')
plt.plot(contract_test.index, contract_test['count'], label='Test')
plt.plot(svr_errors_df.index, svr_errors_df['Predicted_Count'], label='Forecast - Support Vector Regressor')
plt.legend(loc='best')
plt.xlabel('Intervals')
plt.ylabel('Count')
plt.title('Forecast using Support Vector Regressor')
plt.xticks(bal2.index.unique())
plt.locator_params(axis='x', nbins=10)
plt.grid(True)
plt.show()
metrics_table
| Modelname | Total_Count | Total_Pred_Count | Model_Overall_Error | MAE | RMSE | MAPE |
|---|---|---|---|---|---|---|
| Support Vector Regressor | 554100 | 591339.950117 | 37239.950117 | 1549.257795 | 2479.714193 | 9.785963 |
| Linear Regression | 554100 | 556990.797802 | 2890.797802 | 1569.672280 | 1956.473486 | 9.914912 |
| Holtman- TES | 554100 | 671226.045459 | 117126.045459 | 4614.475776 | 5460.132469 | 29.147564 |
| SARIMA | 554100 | 677672.762120 | -123572.762120 | 4182.641275 | 4993.732693 | 26.419860 |
| ExtreeTreesRegressor | 554100 | 558054.540000 | 3954.540000 | 1406.727429 | 1881.467556 | 8.885663 |
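The code that fills `metrics_table` is not shown in this part of the notebook. The sketch below shows one way such columns could be computed, under two assumptions: the overall error is total predicted minus total actual (the convention most rows follow), and MAPE is expressed as a percentage. `forecast_metrics` is a hypothetical helper and the input values are illustrative, not taken from the dataset.

```python
import math


def forecast_metrics(actual, predicted):
    """Summary error metrics for two equally long sequences of counts."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    return {
        "Total_Count": sum(actual),
        "Total_Pred_Count": sum(predicted),
        # Assumed sign convention: predicted minus actual.
        "Model_Overall_Error": sum(predicted) - sum(actual),
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        # Percentage error; requires every actual value to be non-zero.
        "MAPE": 100 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n,
    }


m = forecast_metrics([100, 200, 400], [110, 190, 400])
print(m)
```

In this toy case the overall error is zero although the MAE is not, because an over-prediction and an under-prediction cancel out. For the same reason a small `Model_Overall_Error` on its own does not show that a model tracks the series closely.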
# Check of the model
print('Extra Tree Regressor')
print('Model Score at Train set: {:.2%}'.format(etr_model.score(X_train, y_train)))
print('Model Score at Test set: {:.2%}'.format(etr_model.score(X_test, y_test)))
Extra Tree Regressor
Model Score at Train set: 100.00%
Model Score at Test set: 72.51%
plt.figure(figsize=(4,4), dpi=1000)
plot_tree(dct,
feature_names=["ETA","ANNO","contratto_transformed","nazionalita_transformed","genere_transformed"],
filled=True,)
# Show Decision tree
plt.show()
# Predict the class probabilities
y_pred_prob = dct.predict_proba(X_test)
y_pred_prob
array([[0.00862069, 0.00123153, 0. , ..., 0.00615764, 0.00123153,
0. ],
[0.00469484, 0. , 0. , ..., 0.08920188, 0.00469484,
0. ],
[0. , 0. , 0. , ..., 0.00571429, 0.00571429,
0. ],
...,
[0.00603136, 0.00120627, 0.00120627, ..., 0.00723764, 0.00120627,
0. ],
[0.00365408, 0. , 0. , ..., 0.00730816, 0.00121803,
0. ],
[0.00487211, 0. , 0. , ..., 0.00243605, 0. ,
0. ]])
# Predict the values
y_pred = dct.predict(X_test)
y_pred
array([25, 56, 43, ..., 25, 62, 46], dtype=uint8)
# Print the confusion matrix
confusion = metrics.confusion_matrix(y_test, y_pred)
print(confusion)
[[ 237    0    0 ...    2   80    0]
 [   1    0    0 ...    1    0    0]
 [   0    0    0 ...    0    1    0]
 ...
 [  19    0    0 ... 2140   37    1]
 [  75    0    0 ...  856 1235  147]
 [   2    0    0 ...    2    3    0]]
# Show the confusion matrix
fig=px.imshow(normalized,color_continuous_scale='blues')
fig.show()
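How `normalized` was derived from `confusion` is not shown in this part of the notebook. One plausible reading, sketched below as an assumption, is row-wise normalisation, so that each row gives the share of a true class assigned to every predicted class. `normalize_rows` is a hypothetical helper working on nested lists rather than on the NumPy array.

```python
def normalize_rows(matrix):
    """Divide each row by its sum; rows that sum to zero stay all zeros."""
    normalized_rows = []
    for row in matrix:
        total = sum(row)
        normalized_rows.append([cell / total if total else 0.0 for cell in row])
    return normalized_rows


# Illustrative 3x2 counts, not taken from the notebook's confusion matrix.
example = normalize_rows([[3, 1], [0, 0], [2, 2]])
print(example)
```

Normalising by row keeps rare classes visible in the heat map, since raw counts would otherwise be dominated by the few very frequent classes.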
f, ax1 = plt.subplots(1,1,figsize=(15,5))
balanced2.plot(ax=ax1)
ax1.set_xlabel("time")
ax1.set_ylabel("Number of Contracts")
Text(0, 0.5, 'Number of Contracts')
# Run the Augmented Dickey-Fuller test, which checks for a unit root in a univariate
# process in the presence of serial correlation.
results = adfuller(balanced2['count'])
print('ADF Statistic: %f' % results[0])
print('p-value: %f' % results[1])
ADF Statistic: -3.504387
p-value: 0.007875
fig, ax = plt.subplots(4, 1, figsize=(15, 6))
# Plot the series
decomposed_add.observed.plot(ax = ax[0])
decomposed_add.trend.plot(ax = ax[1])
decomposed_add.seasonal.plot(ax = ax[2])
decomposed_add.resid.plot(ax = ax[3])
# Add the labels to the Y-axis
ax[0].set_ylabel('')
ax[1].set_ylabel('Trend')
ax[2].set_ylabel('Seasonal')
ax[3].set_ylabel('Residual')
plt.tight_layout()
plt.show()
arima_df = arima_aic(balanced2)
arima_df
This problem is unconstrained.
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 1 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.18383D+00 |proj g|= 0.00000D+00
* * *
N Tit Tnf Tnint Skip Nact Projg F
1 0 1 0 0 0 0.000D+00 9.184D+00
F = 9.1838298314262747
CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 2 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.08450D+00 |proj g|= 2.03286D-03
At iterate 5 f= 9.08449D+00 |proj g|= 1.26121D-05
At iterate 10 f= 9.08449D+00 |proj g|= 2.59348D-05
At iterate 15 f= 9.08449D+00 |proj g|= 2.55795D-05
At iterate 20 f= 9.08449D+00 |proj g|= 0.00000D+00
* * *
N Tit Tnf Tnint Skip Nact Projg F
2 20 30 1 0 0 0.000D+00 9.084D+00
F = 9.0844865979947294
CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 3 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.07312D+00 |proj g|= 1.91882D-03
At iterate 5 f= 9.07311D+00 |proj g|= 7.10543D-07
At iterate 10 f= 9.07311D+00 |proj g|= 2.30926D-06
At iterate 15 f= 9.07311D+00 |proj g|= 3.60600D-05
At iterate 20 f= 9.07311D+00 |proj g|= 4.08562D-06
* * *
N Tit Tnf Tnint Skip Nact Projg F
3 24 42 1 0 0 3.553D-07 9.073D+00
F = 9.0731115171749259
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
This problem is unconstrained.
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 4 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.07166D+00 |proj g|= 1.95630D-03
At iterate 5 f= 9.07164D+00 |proj g|= 3.55272D-07
At iterate 10 f= 9.07164D+00 |proj g|= 2.48690D-06
At iterate 15 f= 9.07164D+00 |proj g|= 5.41789D-05
At iterate 20 f= 9.07164D+00 |proj g|= 3.55271D-07
* * *
N Tit Tnf Tnint Skip Nact Projg F
4 21 28 1 0 0 1.776D-07 9.072D+00
F = 9.0716437664765763
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 2 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.13668D+00 |proj g|= 1.11218D-03
* * *
N Tit Tnf Tnint Skip Nact Projg F
2 4 6 1 0 0 3.553D-07 9.137D+00
F = 9.1366800964879182
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 3 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.07632D+00 |proj g|= 2.06182D-03
At iterate 5 f= 9.07630D+00 |proj g|= 1.06581D-06
At iterate 10 f= 9.07630D+00 |proj g|= 2.25597D-05
At iterate 15 f= 9.07630D+00 |proj g|= 1.24345D-06
* * *
N Tit Tnf Tnint Skip Nact Projg F
3 18 33 1 0 0 1.776D-07 9.076D+00
F = 9.0763026500350730
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 4 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.07252D+00 |proj g|= 1.85310D-03
At iterate 5 f= 9.07250D+00 |proj g|= 1.84741D-05
* * *
N Tit Tnf Tnint Skip Nact Projg F
4 8 11 1 0 0 1.776D-07 9.073D+00
F = 9.0725032258534544
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 3 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.08715D+00 |proj g|= 2.28830D-03
At iterate 5 f= 9.08714D+00 |proj g|= 3.55272D-07
* * *
N Tit Tnf Tnint Skip Nact Projg F
3 5 8 1 0 0 3.553D-07 9.087D+00
F = 9.0871362631126509
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 4 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.06865D+00 |proj g|= 2.30198D-03
At iterate 5 f= 9.06863D+00 |proj g|= 1.26299D-04
At iterate 10 f= 9.06863D+00 |proj g|= 1.77636D-06
At iterate 15 f= 9.06863D+00 |proj g|= 1.98952D-05
At iterate 20 f= 9.06863D+00 |proj g|= 3.33955D-05
At iterate 25 f= 9.06863D+00 |proj g|= 1.77636D-07
* * *
N Tit Tnf Tnint Skip Nact Projg F
4 25 36 1 0 0 1.776D-07 9.069D+00
F = 9.0686291554762981
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
This problem is unconstrained.
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 5 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.06863D+00 |proj g|= 2.33129D-03
At iterate 5 f= 9.06861D+00 |proj g|= 1.03384D-04
At iterate 10 f= 9.06861D+00 |proj g|= 1.42109D-06
At iterate 15 f= 9.06861D+00 |proj g|= 2.41585D-05
At iterate 20 f= 9.06861D+00 |proj g|= 2.25597D-05
At iterate 25 f= 9.06861D+00 |proj g|= 1.77636D-07
* * *
N Tit Tnf Tnint Skip Nact Projg F
5 26 41 1 0 0 1.776D-07 9.069D+00
F = 9.0686133041049555
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 4 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.06476D+00 |proj g|= 2.34870D-03
At iterate 5 f= 9.06474D+00 |proj g|= 8.88179D-07
At iterate 10 f= 9.06474D+00 |proj g|= 2.66454D-06
At iterate 15 f= 9.06474D+00 |proj g|= 7.49623D-05
At iterate 20 f= 9.06474D+00 |proj g|= 1.11910D-05
* * *
N Tit Tnf Tnint Skip Nact Projg F
4 23 32 1 0 0 1.776D-07 9.065D+00
F = 9.0647364237773278
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 5 M = 12
At X0 0 variables are exactly at the bounds
At iterate 0 f= 9.05508D+00 |proj g|= 3.69145D-03
At iterate 5 f= 9.05504D+00 |proj g|= 7.24754D-05
At iterate 10 f= 9.05504D+00 |proj g|= 7.10543D-07
At iterate 15 f= 9.05504D+00 |proj g|= 1.24345D-06
At iterate 20 f= 9.05504D+00 |proj g|= 8.88178D-06
At iterate 25 f= 9.05504D+00 |proj g|= 7.99361D-06
At iterate 30 f= 9.05504D+00 |proj g|= 4.97380D-05
At iterate 35 f= 9.05504D+00 |proj g|= 1.77636D-06
* * *
N Tit Tnf Tnint Skip Nact Projg F
5 38 46 1 0 0 5.329D-07 9.055D+00
F = 9.0550443345253786
CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH
This problem is unconstrained.
| | p | q | aic | bic | sum_aic_bic |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 2502.001714 | 2507.827024 | 5009.828738 |
| 4 | 1 | 0 | 2491.176986 | 2499.914951 | 4991.091937 |
| 8 | 2 | 0 | 2479.701064 | 2491.351683 | 4971.052747 |
| 10 | 2 | 2 | 2478.662819 | 2496.138748 | 4974.801567 |
| 6 | 1 | 2 | 2477.720877 | 2492.284152 | 4970.005029 |
| 3 | 0 | 3 | 2477.487104 | 2492.050379 | 4969.537483 |
| 1 | 0 | 1 | 2476.980355 | 2485.718319 | 4962.698674 |
| 5 | 1 | 1 | 2476.754321 | 2488.40494 | 4965.159261 |
| 9 | 2 | 1 | 2476.66713 | 2491.230405 | 4967.897535 |
| 2 | 0 | 2 | 2475.886333 | 2487.536952 | 4963.423285 |
| 12 | 3 | 0 | 2475.608307 | 2490.171582 | 4965.779889 |
| 13 | 3 | 1 | 2474.972059 | 2492.447988 | 4967.420047 |
plot = results.plot_diagnostics()
# Plot mean SARIMA predictions
fig,ax = plt.subplots(1,1,figsize=(20,8))
plt.plot(balanced2, label='original')
plt.plot(forecast.predicted_mean, label='SARIMAX', c="r")
plt.xticks(balanced2.index.unique())
plt.locator_params(axis='x', nbins=10)
plt.xlabel('time')
plt.ylabel('Number of contracts')
plt.title('Pre-covid Situation')
plt.legend()
plt.grid(True)
plt.show()
f, ax1 = plt.subplots(1,1,figsize=(15,5))
balanced_covid1.plot(ax=ax1)
ax1.set_xlabel("time")
ax1.set_ylabel("Number of Contracts")
Text(0, 0.5, 'Number of Contracts')
# Plot mean SARIMA predictions
fig,ax = plt.subplots(1,1,figsize=(20,8))
plt.plot(balanced2, label='Real values before COVID')
plt.plot(forecast.predicted_mean, label='SARIMAX prediction', c="r")
plt.plot(balanced_covid1, label='Real values after COVID', c='g')
plt.xticks(balanced2.index.unique())
plt.locator_params(axis='x', nbins=10)
plt.xlabel('time')
plt.ylabel('Number of contracts')
plt.title('COVID Situation (Checking)')
plt.legend()
plt.grid(True)
plt.show()